Research Issues in Supporting Data Intensive Applications within an Exascale System

Authors

  • Abhirup Chakraborty
  • Dilma Da Silva
Abstract

Analyzing large graphs is crucial to a variety of application domains, such as personalized recommendations in social networks, search engines, communication networks, and computational biology. In these domains, there is a need to process aggregation queries over large graphs. Existing approaches to aggregation are not suitable for large graphs, as they involve multi-way relational joins over gigantic tables or repeated multiplication of large matrices. In this report, we consider top-K aggregation queries, which identify the top-K nodes with the highest aggregate values over their h-hop neighbors. We propose algorithms for processing such queries over large graphs in a shared-nothing environment. In particular, we propose a hybrid algorithm that minimizes the network load incurred in shuffling data across the processing nodes. The algorithm partitions the graph across the processing nodes and uses the Floyd-Warshall algorithm within each node. The nodes shuffle updates among themselves in iterative phases; such incremental iterative processing is similar to route discovery in a large network. The algorithm needs only a few iterations to converge to an equilibrium state.
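To make the query itself concrete, the following is a minimal single-machine sketch in Python, not the authors' distributed hybrid algorithm. It assumes an unweighted, undirected graph, SUM as the aggregate, and that a node's own score counts toward its total; none of these choices are stated in the abstract. The function names (hop_distances, top_k_aggregate) and the sample data are hypothetical. Hop distances are computed with Floyd-Warshall, the same routine the abstract says is run within each processing node.

```python
import heapq
from math import inf


def hop_distances(nodes, edges):
    """All-pairs hop counts via Floyd-Warshall (unweighted, undirected graph)."""
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v in edges:
        dist[u][v] = 1
        dist[v][u] = 1
    for k in nodes:
        for i in nodes:
            d_ik = dist[i][k]
            if d_ik == inf:
                continue  # no path i -> k, so k cannot shorten any i -> j
            for j in nodes:
                if d_ik + dist[k][j] < dist[i][j]:
                    dist[i][j] = d_ik + dist[k][j]
    return dist


def top_k_aggregate(nodes, edges, score, h, k):
    """Return the k nodes with the largest SUM of scores within h hops.

    Assumption: a node's own score (distance 0) is included in its total.
    """
    dist = hop_distances(nodes, edges)
    totals = {
        u: sum(score[v] for v in nodes if dist[u][v] <= h)
        for u in nodes
    }
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


# Hypothetical example: a path graph a - b - c - d - e.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
score = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 6}

result_h1 = top_k_aggregate(nodes, edges, score, h=1, k=2)
result_h2 = top_k_aggregate(nodes, edges, score, h=2, k=1)
```

Floyd-Warshall takes cubic time in the number of nodes, which is why it is plausible only on a per-partition basis; in the distributed setting the abstract describes, each partition would additionally exchange boundary updates with the others until no distances change.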


Similar resources

Simulation of Terabit Data Flows for Exascale Applications

Scientific workflows are increasingly drawing attention as both data and compute resources are getting bigger, heterogeneous, and distributed. Many science workflows are both compute and data intensive and use distributed resources. This situation poses significant challenges in terms of real-time remote analysis and dissemination of massive datasets to scientists across the community. These ch...


FusionFS: a distributed file system for large scale data-intensive computing

Today’s science is generating datasets that are increasing exponentially in both complexity and volume, making their analysis, archival, and sharing one of the grand challenges of the 21st century. Exascale computing, i.e. 10^18 FLOPS, is predicted to emerge by 2019 with current trends. Millions of nodes and billions of threads of execution, producing similarly large concurrent data accesses, are ...


Towards Supporting Data-Intensive Scientific Applications on Extreme-Scale High-Performance Computing Systems

Many believe that the state-of-the-art yet decades old high-performance computing (HPC) storage would not meet the I/O requirement of the emerging exascale mainly due to the segregation of compute and storage resources. Indeed, our simulation predicts, quantitatively, that the efficiency and availability would go towards zero as the system scales approach exascale. This work proposes a new arch...


The International Exascale Software Project: a Call To Cooperative Action By the Global High-Performance Community

When processor clock speeds flatlined in 2004, after more than 15 years of exponential increases, the computational science community lost the key to the automatic performance improvements its applications had traditionally enjoyed. Subsequent developments in processor and system design — hundreds of thousands of nodes, millions of cores, reduced bandwidth and memory available to cores, inclusi...


ESSEX: Equipping Sparse Solvers for Exascale

The ESSEX project investigates computational issues arising at exascale for large-scale sparse eigenvalue problems and develops programming concepts and numerical methods for their solution. The project pursues a coherent co-design of all software layers where a holistic performance engineering process guides code development across the classic boundaries of application, numerical method, and b...



Publication date: 2011